BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
We introduce Benchmarking-IR (BEIR), a robust and heterogeneous evaluation benchmark for information retrieval.
We leverage a careful selection of 18 publicly available datasets from diverse text retrieval tasks and domains, and evaluate 10 state-of-the-art retrieval systems, including lexical, sparse, dense, late-interaction, and re-ranking architectures, on the BEIR benchmark.
BM25 is the baseline.
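
To make the lexical baseline concrete, the following is a minimal sketch of Okapi BM25 scoring over pre-tokenized documents. The function name `bm25_scores`, the toy documents and query, the defaults `k1=1.2` and `b=0.75`, and the Lucene-style IDF variant are all assumptions chosen for illustration; they are not taken from the benchmark and may differ from the configuration actually used.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25.

    Hypothetical helper for illustration; k1 and b are assumed defaults.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Lucene-style IDF (an assumed variant) stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (made up for this example), tokenized by whitespace.
docs = [
    "bm25 is a lexical retrieval baseline".split(),
    "dense retrievers encode text into vectors".split(),
    "late interaction models compare token embeddings".split(),
]
query = "lexical retrieval baseline".split()
scores = bm25_scores(query, docs)
# Index of the highest-scoring document.
best = max(range(len(docs)), key=scores.__getitem__)
```

Because BM25 matches exact tokens only, documents that share no term with the query score zero here. That behavior is what distinguishes a lexical method from the dense and late-interaction architectures listed above.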